Cluster Services view
The Cluster Services view enables you to monitor the status of servers that comprise your N4 installation cluster across yards (if you have multiple yards).
You can monitor the progress of all services within the network at the scope at which you are currently logged in. If you log in at the Operator or Complex level, the Cluster Services view is not available. If you log in at the Facility or Yard level, you see all the services available at the single yard level, as well as the services that N4 and other components use. You can create a filtered view of the table to limit the display.
The columns list detailed information about each server. From the Actions menu, you can generate a report that compares the Bridge's in-memory model with the current database contents.
Access to the Cluster Services view is provided through the privilege: ADMIN_SYSTEM_MONITOR.
The N4 interface displays the name of the current N4 instance in the status bar next to the current scope.
Each yard installation has one XPS/Bridge host, a Center node host, a Standby Center node host, N4-cluster node hosts, and a database host. One or more XPS client workstations connect to the XPS server.
The Cluster Services view displays a record for each service and network port within the user's current scope. This includes a record for each N4 node in the cluster, the XPS server, and each XPS client. Some services, such as XPS and the Bridge daemon, present multiple services that are used by N4 or other components, and the list shows these services on separate rows.
The N4 setting ARGOCORE003 (CLUSTER_SERVICE_REFRESH_FREQUENCY_IN_SECONDS) sets how frequently the Cluster Services view automatically refreshes. The default is 30 seconds.
When the first N4 Cluster node starts, only that first node appears in the Cluster Services view. Other cluster nodes appear in the Cluster Services view when they are started and when they join the cluster. Once the active N4 Center node is started, it removes all inactive services from the Cluster Services view, and displays only active services. Because the required start-up sequence is to start all of the Cluster nodes first, and then the Center node, after a shutdown, the first Cluster node may appear in the Cluster Services view along with any stale services that display as active but are actually inactive. Inactive services on the Cluster Services view stop appearing only when the active Center node is started.
The monitoring function in the Cluster Services view only considers ACTIVE and INACTIVE services based on the following descriptions:
An INACTIVE service is not necessarily dead. It means that no heartbeat has been received in the past two minutes. (The two-minute window is currently hard-coded.) For the ClusterNode or BridgeDaemon service types, DISCONNECTED means that the node has a heartbeat, but the heartbeat has not reached the Center node in the past two minutes.
An INACTIVE cluster service can become ACTIVE once the network issue is resolved.
An ACTIVE service may become INACTIVE if the network is slow.
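As a rough illustration of the rule above, the following sketch classifies a service from the age of its last heartbeat. The function name `monitored_status` and the handling of the exact two-minute boundary are assumptions made for illustration; N4 performs this check internally and does not expose it in this form.

```python
from datetime import datetime, timedelta

# Two-minute window taken from the description above; the rest is illustrative.
HEARTBEAT_WINDOW = timedelta(minutes=2)


def monitored_status(last_heartbeat: datetime, now: datetime) -> str:
    """Return ACTIVE if a heartbeat arrived within the window, else INACTIVE.

    Treating a heartbeat exactly two minutes old as ACTIVE is an assumption.
    """
    if now - last_heartbeat <= HEARTBEAT_WINDOW:
        return "ACTIVE"
    return "INACTIVE"


now = datetime(2024, 1, 1, 12, 0, 0)
print(monitored_status(now - timedelta(seconds=45), now))  # recent heartbeat
print(monitored_status(now - timedelta(minutes=3), now))   # heartbeat overdue
```

Because the classification depends only on heartbeat age, a service flips back to ACTIVE as soon as a fresh heartbeat arrives, which matches the behavior described for resolved network issues.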
You can remove nodes from the Cluster Services view if they have a status of INACTIVE or SHUTDOWN. To remove a node, select it, right-click, and select Delete from the menu that appears. N4 does not display the Delete option if the node is ACTIVE. If the heartbeat monitor later detects a heartbeat from a node that has been deleted from the view, the node appears again with the ACTIVE status.
N4 does not allow you to delete nodes that are part of a Job Group.
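The deletion rules above can be summarized in a small predicate. This is a minimal sketch only: the `NodeRow` class and its fields are hypothetical stand-ins for a row in the view, not an N4 API.

```python
from dataclasses import dataclass

# Statuses for which the text says a node can be removed from the view.
DELETABLE_STATUSES = {"INACTIVE", "SHUTDOWN"}


@dataclass
class NodeRow:
    """Hypothetical representation of one row in the Cluster Services view."""
    name: str
    status: str
    in_job_group: bool = False


def can_delete(node: NodeRow) -> bool:
    """A node is deletable only if it is INACTIVE or SHUTDOWN and not in a Job Group."""
    return node.status in DELETABLE_STATUSES and not node.in_job_group


print(can_delete(NodeRow("node-a", "SHUTDOWN")))                     # deletable
print(can_delete(NodeRow("node-b", "ACTIVE")))                       # not deletable
print(can_delete(NodeRow("node-c", "INACTIVE", in_job_group=True)))  # not deletable
```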
For information on starting up and shutting down the N4 system, see startup and shutdown procedures.
XPS acquires all reference data from N4 and writes out a codes.txt file. In production, XPS runs in 'bridged' mode (controlled with the persister_model setting in the server section of the settings.xml file). For testing purposes only, you can run XPS in 'file' mode instead. In 'file' mode, XPS reads its reference data from codes.txt on startup rather than acquiring it from N4.
The clusterServicesDiagnostics MBean replicates in your system monitoring tool, such as JConsole or Zabbix, the same information seen in the Cluster Services view. See more information in Administration > Debug > Node Info Desk view > select node > Actions > Node Attributes > Diagnostics view for [node name] > clusterServiceDiagnosticsMBean by clicking each of the MBean's attributes. Details appear in the right pane.
Cluster Services view - Actions menu options
The Actions menu option is:
Perform Data Integrity Check: Generates a report that compares the Bridge daemon's in-memory model with the current database contents for the facility at which the user is currently logged in. The summary section of the report is displayed as a table that includes the count of differences found.
This command is only available when logged in at Yard scope, because it pertains to a specific yard.
The following table lists the various columns in the list view:
Column Name | Details |
---|---|
ID | The service ID, including the scope (Operator/Complex/Facility/Yard) in which the service operates, if relevant. You can only see objects at a scope lower than the scope at which you are logged in. You can filter the table to select a specific scope of items to view. |
Name | The name of the node. Some nodes run multiple services. |
Type | The type of service provided. For example, XPS provides a gate service for N4, and a message service for ECN4. For information about specific services within each type, see the List of Known Services table, below. |
Status | The latest self-reported state of the service. Normally a running service is ACTIVE. When it is starting up, it may go through phases such as STARTING, INITIALIZING, or LOADING. When a service shuts down cleanly, it sets the state to SHUTDOWN. If a service shuts down unexpectedly, it does not have an opportunity to set the state to SHUTDOWN, in which case it may appear ACTIVE, but closer inspection shows that its Heartbeat column is no longer updating. Some nodes (such as XPS client) remove themselves from the list rather than stay there showing SHUTDOWN; this is the case for clients because they come and go frequently and do not provide services to other nodes. The self-reported statuses of the different services include: |
Ack Delay | The average time duration for the JMS message consuming processes. This column displays values for the Cluster Node and BridgeDaemon Types, only when you log in to N4 on the center node. This column can help you quickly identify the node that is processing slowly. There are two values shown: On each cluster node, there are multiple consuming processes for JMS messages. For each consuming process, the average time it takes for the center node to dispatch a message and receive the acknowledgment (ACK) is calculated over the last five messages within the past 10 minutes. The Ack Delay column shows the worst averages for the N4 and A4 consuming processes. When the worst average exceeds the threshold, the Ack Delay cell background is red. When you see Ack Delay turn red, you should check the health of that node. By default, the threshold is 500 milliseconds. You can change the threshold with these settings in the Settings view (Administration |
IP | The IP address of the service, and if applicable, the port number on which it listens for connections. |
Port | The port number on which the service listens for connections. Not all services provide a port for other services to talk to, so they do not display a port number. The various N4 messaging components use this information to locate each other. |
Version | The build version of the service, if self-reported by the service. The format of this string varies depending on the technology on which the service is built. |
Info | This column is a place where a service can post any information it wishes. Currently the N4CacheMaster nodes report a string containing some basic statistics about how that node views the cache. The N4RESTWebService uses this column to display the web service path. |
Startup | Date/time when the service was started. |
Heartbeat | A timestamp emitted by the service at a regular interval (at least once per minute). This is the time of the latest heartbeat posted by the service. In the event of a system crash, you can use this to determine the time the problem started. |
Activity | Some services report the date/time that they last saw a journal entry/activity. |
Shutdown | The date/time when the state went into SHUTDOWN. Some service nodes (such as SPARCSClient) remove their row from the display rather than setting the state to SHUTDOWN and displaying a time stamp. |
User | For the service Type SPARCSClient, the User ID of the logged-on user. Currently, the other service types do not have a user associated with them. |
Memory Used | The amount of memory, in megabytes, the service or node is using. The ability to do this is specific to the technology the service uses. The JVM-based services (N4CacheMaster and Bridge services) report the JVM memory used and total size. XPS, when running on Windows, can report the operating system's total memory used and total memory size. |
Memory Max | The total memory size, in megabytes, of the service or node. |
Monitoring Node | The node that is currently performing the status updates for the Cluster Services view. |
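The Ack Delay calculation described in the table above can be sketched as follows. This is an illustration under stated assumptions, not N4's implementation: the function names, the `(ack_time, delay_ms)` sample format, and the consumer names are hypothetical. The five-message sample, the 10-minute window, and the 500 millisecond default threshold come from the description.

```python
from datetime import datetime, timedelta
from statistics import mean

WINDOW = timedelta(minutes=10)  # only messages from the past 10 minutes count
SAMPLE_SIZE = 5                 # average over the last five messages
THRESHOLD_MS = 500              # default threshold from the description


def average_ack_delay(samples, now):
    """Average delay (ms) of the latest five samples inside the window.

    `samples` is a list of (ack_time, delay_ms) tuples for one consuming
    process; this format is an assumption for illustration.
    """
    recent = sorted((s for s in samples if now - s[0] <= WINDOW), key=lambda s: s[0])
    latest = recent[-SAMPLE_SIZE:]
    return mean([delay for _, delay in latest]) if latest else None


def worst_average(processes, now):
    """Largest per-process average across all consuming processes on a node."""
    averages = [average_ack_delay(s, now) for s in processes.values()]
    averages = [a for a in averages if a is not None]
    return max(averages) if averages else None


def is_red(worst):
    """True when the worst average exceeds the threshold."""
    return worst is not None and worst > THRESHOLD_MS


now = datetime(2024, 1, 1, 12, 0, 0)
processes = {
    "consumer-1": [(now - timedelta(minutes=m), d) for m, d in [(1, 100), (2, 120), (3, 110)]],
    # The 15-minute-old sample falls outside the window and is ignored.
    "consumer-2": [(now - timedelta(minutes=m), d) for m, d in [(1, 700), (2, 650), (15, 50)]],
}
worst = worst_average(processes, now)
print(worst, is_red(worst))
```

Taking the maximum of the per-process averages means a single slow consuming process is enough to flag the node, even if the others are healthy.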
List of Known Services
Type | Name | Description | Implications If Not ACTIVE* ** |
---|---|---|---|
N4CacheMaster | (administrator-assigned N4 node name) | N4 interface to the cache for the yard; one per yard per N4 Node. | If not present and ACTIVE, that N4 node is not running. |
N4BentoService | (administrator-assigned N4 node name) | N4 background job that processes messages sent by XPS or ECN4; one per yard. (Starts automatically, or can be manually started in the N4 UI.) | If not present and ACTIVE, the Bento messages from XPS or ECN4 fail. |
BridgeDaemon | bridge | The bridge daemon's primary service and cache interface. | If not present and ACTIVE, the bridge daemon is not running. |
BridgeService | bridge | The bridge daemon's listener to which XPS connects. | If not present and ACTIVE, XPS cannot complete startup. |
BridgeControl | bridge | The bridge daemon's debug command line. Defaults to port 12000. | -- for developer use only |
XPSDaemon | xps | The XPS server process. | If not present and ACTIVE, XPS is not running. |
XPSControl | xps | The XPS server's debug command line. | -- for developer use only |
N4GateService | xps | The XPS server's listener to which N4 connects to perform gate transactions. | If not present and ACTIVE, N4 gate operations fail. |
XPSMessageService | xps | The XPS server's listener to which Live View clients connect. | If not present and ACTIVE, Live View clients cannot connect. |
ExpertDeckerService (also known as "ECN4 XD server") | (xps; or a SPARCS client IP address) | The listener that processes decking requests; may run on XPS or on a SPARCS client. If XD is busy, ECN4 makes the decking request to XPS instead. | If not present and ACTIVE, ECN4 relies on XPS to refine the target position for the containers upon dispatch, which has performance implications. ECN4 also relies on XPS to offer an optimal position when rehandling containers, which is likewise a performance concern. |
XDService (also known as "N4 XD server") | (XPS client IP address) | The listener that processes vessel discharge decking requests from N4; runs on an XPS client. If an N4 XD server is not available, N4 sends the vessel discharge decking request to XPS instead. | If not present and ACTIVE, N4 relies on XPS to assign positions in the yard for vessel discharges. Large terminals, after consultation with Navis, can use a Scaled N4 XD service to free up XPS processing bandwidth. |
SPARCSClient | (XPS client IP address) | An XPS client connected to XPS. | Present when the client is running. |
ECN4Daemon | ecn4 | The ECN4 daemon's primary service and cache interface. | If not present and ACTIVE, ECN4 is not running. |
ECN4WebService | ECN4WebService | The ECN4Web daemon's primary service. If there are two instances of the ECN4Web service running (recommended for sites with more than 50 ECN4Web clients), then the ID column in the N4 Cluster Services view displays both ECN4Web instances, each appended with its 'IP address' and 'port ID'. You can optionally set the scope for this service in the ECN4Web application.properties file. If you do not set it there, ECN4 intercepts the network node message and sets its own scope as the default. (Normally, ECN4Web is in the same scope as ECN4.) | If not present and ACTIVE, ECN4Web is not running. |
N4RestWebService | (administrator-assigned N4 node name) When configured with the load balancer, N4 creates a single service. When configured without a load balancer (recommended only for testing purposes), N4 creates a separate N4RestWebService for each network node. | The geodetic service. A non-unique network node initialized based on the network topology of the current N4 installation. Spatial bin information is available using the REST web service. ECN4 uses this web service to get that information. Unlike other services, the N4RestWebService is a virtual node. For that reason, it requires configuration in the N4 Settings view. See the four settings with IDs ARGORESTWEBSERVICE001 (LOAD_BALANCER_ENABLED) through ARGORESTWEBSERVICE004. | If not present and ACTIVE, a geodetic service is not running. |
XMLRDTService | ecn4 | The ECN4 daemon's listener for XMLRDT messages. If ACTIVE, shows the address/port ECN4 listens on for XMLRDT. | If not present and ACTIVE, ECN4 does not accept XMLRDT messages. |
ECN4BentoServerService | ecn4 | The ECN4 daemon's listener for messages from XPS and XPS clients. If ACTIVE, shows the address/port ECN4 listens on for bento messages from XPS clients (e.g., dispatch, CHE reset, etc.) and XPS (e.g., load/discharge EC events). | If not present and ACTIVE, ECN4 does not accept bento messages, and these fail. |
KafkaServer | (administrator-assigned name of the Kafka broker) | One of the hosts for the Apache Kafka messaging platform. | If not present and ACTIVE, that Kafka broker is inactive or disconnected. If other KafkaServer hosts are active, N4 is running. |

* A service that is in a state other than ACTIVE (SHUTDOWN, INITIALIZING, CONNECTED, LOADING, etc.) is as useless as if it were not present.
** Services are free to define their own states. Please contact your Navis representative if you have questions about the meaning of unlisted state labels.
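The "present and ACTIVE" rule used throughout the List of Known Services can be applied mechanically to a snapshot of the view. The sketch below assumes the rows have been copied into plain dictionaries with `type`, `name`, and `status` keys; that format, the sample data, and the function name `types_not_active` are hypothetical and are not an N4 export format or API.

```python
def types_not_active(rows, required_types):
    """Return the required service types that have no row with status ACTIVE.

    A type whose only rows are in another state (SHUTDOWN, INITIALIZING, and
    so on) is treated the same as a type that is absent, per the footnote.
    """
    active = {row["type"] for row in rows if row["status"] == "ACTIVE"}
    return [t for t in required_types if t not in active]


# Hypothetical snapshot of the Cluster Services view.
rows = [
    {"type": "N4CacheMaster", "name": "node-a", "status": "ACTIVE"},
    {"type": "BridgeDaemon", "name": "bridge", "status": "ACTIVE"},
    {"type": "BridgeService", "name": "bridge", "status": "INITIALIZING"},
    {"type": "XPSDaemon", "name": "xps", "status": "SHUTDOWN"},
]
required = ["N4CacheMaster", "BridgeDaemon", "BridgeService", "XPSDaemon", "N4GateService"]

print(types_not_active(rows, required))
```

Each type returned can then be looked up in the Implications If Not ACTIVE column to see what is affected.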